An Efficient Solvent Accessible Surface Area calculation applied in Ab Initio Protein Structure Prediction
Authors
Abstract
Knowing the structure of proteins is an essential step in developing new medicines, but determining it is a very time-consuming and expensive process. Researchers from many different areas are trying to find new and efficient ways to discover protein structures. This is known as the Protein Structure Prediction (PSP) problem, and approaches to it are divided into experimental and in silico methods [3]. In this work, we use the in silico approach. The program is called ProtPred [4], an Evolutionary Algorithm (EA) that searches for feasible protein configurations starting from their amino acid sequences [5]. The bottleneck of an EA is the fitness function, which is used to evaluate solutions. The fitness function of ProtPred is composed of the sum of bonded and non-bonded energies. In this work, however, we used only van der Waals energy and Solvation energy, the latter of which requires computing the Solvent Accessible Surface Area (SASA). Both functions are time-consuming, as they need to compute pair-wise interactions among the atoms. Each of these energies has a specific contribution. For example, van der Waals energy tends to attract atoms at a certain distance and prevents atoms from overlapping at short distances, while the SASA term maintains the compactness of the structure. A single fitness function call with van der Waals and Solvation energy in ProtPred is relatively fast. However, to find a promising solution, the fitness function must be called hundreds of thousands of times, even for a small protein; i.e., the bottleneck of any EA for PSP is, in general, the fitness function. For this reason, we need fast and efficient van der Waals and Solvation energy computations. This will allow us to accelerate ProtPred as a whole, enabling it to predict more and larger protein configurations.
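To make the cost of the pair-wise interaction concrete, the following is a minimal sketch of a naive van der Waals term. It assumes a 12-6 Lennard-Jones form with a single hypothetical pair of parameters (epsilon, sigma) for all atoms; the abstract does not state which functional form or parameters ProtPred uses, so this should not be read as the authors' implementation.

```python
import math
from itertools import combinations


def lennard_jones_energy(coords, epsilon=0.1, sigma=3.5):
    """Sum a 12-6 Lennard-Jones term over every unordered pair of atoms.

    coords  : list of (x, y, z) tuples, one per atom
    epsilon : well depth (hypothetical value, chosen for illustration)
    sigma   : distance at which the pair energy is zero (hypothetical value)

    Coincident atoms (distance 0) are not handled and raise ZeroDivisionError.
    """
    energy = 0.0
    # combinations() visits n*(n-1)/2 pairs, so the cost grows as O(n^2).
    for a, b in combinations(coords, 2):
        r = math.dist(a, b)
        sr6 = (sigma / r) ** 6
        # Repulsive r^-12 term minus attractive r^-6 term.
        energy += 4.0 * epsilon * (sr6 * sr6 - sr6)
    return energy


if __name__ == "__main__":
    atoms = [(0.0, 0.0, 0.0), (4.0, 0.0, 0.0), (0.0, 4.0, 0.0)]
    print(lennard_jones_energy(atoms))
```

The loop visits every pair of atoms, which is why one call is cheap for a small protein but the total cost becomes significant once the function is called many times on larger structures.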
Proceedings IWBBIO 2014. Granada, 7-9 April 2014. Daniel Bonetti, Horacio Pérez-Sánchez, and Alexandre Delbem.
In a recent journal article, we demonstrated an efficient way of computing van der Waals energy using a cell-list algorithm and applied it to PSP [1]. We also showed how to compute SASA efficiently using neighbor lists with a Graphics Processing Unit (GPU) based method called MURCIA [7]. Both works showed a complexity reduction from O(n²) to O(n), yielding a significant time reduction. In this research, we replaced the old O(n²) Solvation energy in ProtPred with the new SASA implementation provided by MURCIA. Both the CPU and GPU implementations of MURCIA were used, allowing us to compare the running times of the two techniques. Both versions produced the same SASA value. To compute the implicit interaction of the protein with the solvent, we related the Accessibility Solvation Parameters (ASP) of Carbon, Nitrogen and Oxygen atoms [6] to their SASA as calculated by MURCIA. The sum of the ASP values times SASA gives the Solvation energy, which is used in the fitness function of ProtPred together with van der Waals energy. The experiments were performed on an Intel Xeon E5506 with an NVIDIA Tesla C2075. Three different times were measured: one for van der Waals energy, another for Solvation energy, and the remaining time for the EA (initialization, population generation, composing new solutions, etc.). We performed six runs of ProtPred with MURCIA on the CPU, plus another six on the GPU. Five proteins were chosen from the PDB, ranging from 25 to about 1,000 residues, in order to evaluate the running time and the prediction quality across a range of protein sizes. The convergence criterion adopted in the EA was a limit of one million evaluations. The GPU version of Solvation energy was faster than the CPU version for all protein sizes that we tested. The smallest protein (with 25 amino acids) had a speedup of 2.1, and the largest protein (with 971 amino acids) had a speedup of 26.5.
This corresponds to approximately 3% of the running time of Solvation energy on the CPU. The running times of van der Waals energy and of the remaining parts of the EA stayed constant regardless of whether Solvation energy was computed on the CPU or the GPU. This shows that the choice between GPU and CPU for Solvation energy does not affect the remainder of the algorithm. It is also noticeable that the speedup of Solvation energy on the GPU (relative to the CPU version) grows linearly with the number of atoms. This result is of interest, since we are now able to increase the speedup linearly with the number of atoms using only one processor and one GPU. The smallest protein (with 25 residues) had the best RMSD, measured at 0.942, and the second best (with 50 residues) had an RMSD of 5.088. For proteins above 50 residues, the RMSD was not as good as for these small proteins. In order to work with larger proteins, the parameters of ProtPred should be calibrated first. It will likely be necessary to increase the number of fitness function evaluations. Initially, we ran the experiments for timing comparison purposes. From this point on, it will be possible to properly calibrate the EA by using the fast Solvation energy.
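The Solvation energy described in the abstract, the sum of ASP values times SASA, can be sketched as follows. The per-atom SASA values are assumed to come from an external calculation (MURCIA in the paper). The ASP numbers below are made-up placeholders, not the parameters from reference [6], and the function name and the choice to ignore elements without an ASP entry are assumptions made for illustration.

```python
# Placeholder ASP values per element. These are NOT the parameters of
# reference [6]; they are made up so that the example runs.
EXAMPLE_ASP = {"C": 0.01, "N": -0.1, "O": -0.1}


def solvation_energy(elements, sasa, asp=EXAMPLE_ASP):
    """Return the sum over atoms of ASP(element_i) * SASA_i.

    elements : list of element symbols, one per atom
    sasa     : list of per-atom solvent accessible surface areas
    asp      : mapping from element symbol to its solvation parameter

    Atoms whose element has no ASP entry contribute nothing (an assumption;
    the abstract only mentions Carbon, Nitrogen and Oxygen).
    """
    if len(elements) != len(sasa):
        raise ValueError("elements and sasa must have the same length")
    return sum(asp.get(el, 0.0) * area for el, area in zip(elements, sasa))


if __name__ == "__main__":
    elements = ["C", "N", "O", "H"]
    sasa = [10.0, 5.0, 2.0, 7.0]
    print(solvation_energy(elements, sasa))
```

Once the per-atom SASA values are available, this sum is a single linear pass over the atoms, so the cost of the Solvation term is dominated by the SASA calculation itself.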
Similar resources
SPINE X: Improving protein secondary structure prediction by multistep learning coupled with prediction of solvent accessible surface area and backbone torsion angles
Accurate prediction of protein secondary structure is essential for accurate sequence alignment, three-dimensional structure modeling, and function prediction. The accuracy of ab initio secondary structure prediction from sequence, however, has only increased from around 77 to 80% over the past decade. Here, we developed a multistep neural-network algorithm by coupling secondary structure predi...
Antimalarial Activity of some Conjugated Arylhydrazones: Ab Initio Calculation of Nuclear Quadrupole Coupling Constants (NQCC)
“Malaria” is a life-threatening blood disease in tropical regions that spreads by the bite of the Anopheles mosquito. Antimalarial medications are designed to cure or prevent this infection, and success in this area mostly depends on knowing the drug-receptor interactions and the active sites of the medicine. This improvement can be achieved through understanding the electronic struc...
The effect of Environmental exposure to some chemical solvents on DPPC as important component of lung surfactant: an ab initio study
One of the main components of lung alveoli is surfactant. DPPC (Dipalmitoylphosphatidylcholine) is the predominant lipid component in lung surfactant and is responsible for lowering surface tension in alveoli. In this article, we used a very approximate model with an ab initio computational method to describe the interactions between DPPC as an important component of lung surfactant and some chemical ...
Prediction of cyclin-dependent kinase 2 inhibitor potency using the fragment molecular orbital method
BACKGROUND The reliable and robust estimation of ligand binding affinity continues to be a challenge in drug design. Many current methods rely on molecular mechanics (MM) calculations which do not fully explain complex molecular interactions. Full quantum mechanical (QM) computation of the electronic state of protein-ligand complexes has recently become possible by the latest advances in the de...
Solvent accessible surface area approximations for rapid and accurate protein structure prediction
The burial of hydrophobic amino acids in the protein core is a driving force in protein folding. The extent to which an amino acid interacts with the solvent and the protein core is naturally proportional to the surface area exposed to these environments. However, an accurate calculation of the solvent-accessible surface area (SASA), a geometric measure of this exposure, is numerically demandin...
Improving prediction of secondary structure, local backbone angles, and solvent accessible surface area of proteins by iterative deep learning
Direct prediction of protein structure from sequence is a challenging problem. An effective approach is to break it up into independent sub-problems. These sub-problems such as prediction of protein secondary structure can then be solved independently. In a previous study, we found that an iterative use of predicted secondary structure and backbone torsion angles can further improve secondary s...